NSF PAR Search | NSF Public Access Repository

Improving Few-Shot Generalization by Exploring and Exploiting Auxiliary Data

Albalak, A.; Raffel, C.A.; Wang, W.Y. (December 2023, Neural Information Processing Systems)

Full Text Available
NeuPSL: Neural Probabilistic Soft Logic

https://doi.org/10.24963/ijcai.2023/461

Pryor, Connor; Dickens, Charles; Augustine, Eriq; Albalak, Alon; Wang, William; Getoor, Lise (August 2023, International Joint Conference on Artificial Intelligence (IJCAI))

Full Text Available
Logic-LM: Empowering Large Language Models with Symbolic Solvers for Faithful Logical Reasoning

https://doi.org/10.18653/v1/2023.findings-emnlp.248

Pan, Liangming; Albalak, Alon; Wang, Xinyi; Wang, William (January 2023, Association for Computational Linguistics)
DataComp-LM: In search of the next generation of training sets for language models

https://doi.org/10.48550/arXiv.2406.11794

Li, Jeffrey; Fang, Alex; Smyrnis, Georgios; Ivgi, Maor; Jordan, Matt; Gadre, Samir; Bansal, Hritik; Guha, Etash; Keh, Sedrick; Arora, Kushal; et al. (April 2025, arXiv)

The authors introduce DataComp for Language Models (DCLM), a testbed for controlled dataset experiments aimed at improving language models. DCLM provides a standardized corpus of 240T tokens extracted from Common Crawl, effective pretraining recipes based on the OpenLM framework, and a broad suite of 53 downstream evaluations. Participants can experiment with dataset curation strategies such as deduplication, filtering, and data mixing at model scales ranging from 412M to 7B parameters. As a baseline, the authors find that model-based filtering is critical for assembling a high-quality training set. Their resulting dataset, DCLM-Baseline, enables training a 7B parameter model from scratch to achieve 64% 5-shot accuracy on MMLU with 2.6T training tokens. This represents a 6.6 percentage point improvement over MAP-Neo (the previous state-of-the-art in open-data LMs), while using 40% less compute. The baseline model is also comparable to Mistral-7B-v0.3 and Llama 3 8B on MMLU (63% and 66%), and performs similarly on an average of 53 NLU tasks, while using 6.6x less compute than Llama 3 8B. These findings emphasize the importance of dataset design for training LMs and establish a foundation for further research on data curation.
Free, publicly-accessible full text available April 21, 2026
CausalDialogue: Modeling Utterance-level Causality in Conversations

https://doi.org/10.18653/v1/2023.findings-acl.792

Tuan, Yi-Lin; Albalak, Alon; Xu, Wenda; Saxon, Michael; Pryor, Connor; Getoor, Lise; Wang, William Yang (January 2023, Association for Computational Linguistics)
FETA: A Benchmark for Few-Sample Task Transfer in Open-Domain Dialogue

Albalak, Alon; Tuan, Yi-Lin; Jandaghi, Pegah; Pryor, Connor; Yoffe, Luke; Ramachandran, Deepak; Getoor, Lise; Pujara, Jay; Wang, William Yang (November 2022, Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing)

Full Text Available
NeuPSL: Neural Probabilistic Soft Logic

Pryor, Connor; Dickens, Charles Andrew; Augustine, Eriq; Albalak, Alon; Wang, William Yang; Getoor, Lise (June 2022, arXiv.org)

We present Neural Probabilistic Soft Logic (NeuPSL), a novel neuro-symbolic (NeSy) framework that unites state-of-the-art symbolic reasoning with the low-level perception of deep neural networks. To explicitly model the boundary between neural and symbolic representations, we introduce NeSy Energy-Based Models, a general family of energy-based models that combine neural and symbolic reasoning. Using this framework, we show how to seamlessly integrate neural and symbolic parameter learning and inference. We perform an extensive empirical evaluation and show that NeuPSL outperforms existing methods on joint inference and has significantly lower variance in almost all settings.
Full Text Available
